early-exit strategy
COMSPLIT: A Communication-Aware Split Learning Design for Heterogeneous IoT Platforms
Ninkovic, Vukan, Vukobratovic, Dejan, Miskovic, Dragisa, Zennaro, Marco
The significance of distributed learning and inference algorithms in Internet of Things (IoT) networks is growing, since they flexibly distribute computation load between IoT devices and the infrastructure, enhance data privacy, and minimize latency. However, a notable challenge stems from the influence of communication channel conditions on their performance. In this work, we introduce COMSPLIT: a novel communication-aware design for the split learning (SL) and inference paradigm, tailored to processing time series data in IoT networks. COMSPLIT provides a versatile framework for deploying adaptable SL in IoT networks affected by diverse channel conditions. Combined with an early-exit strategy and support for IoT scenarios containing devices with heterogeneous computational capabilities, COMSPLIT represents a comprehensive design solution for communication-aware SL in IoT networks. Numerical results show superior performance of COMSPLIT compared to vanilla SL approaches (which assume an ideal communication channel), demonstrating its ability to offer both design simplicity and adaptability to different channel conditions.
Understanding the Robustness of Multi-Exit Models under Common Corruptions
Mehra, Akshay, Seto, Skyler, Jaitly, Navdeep, Theobald, Barry-John
Multi-Exit models (MEMs) use an early-exit strategy to improve the accuracy and efficiency of deep neural networks (DNNs) by allowing samples to exit the network before the last layer. However, the effectiveness of MEMs in the presence of distribution shifts remains largely unexplored. Our work examines how distribution shifts generated by common image corruptions affect the accuracy and efficiency of MEMs. We find that under common corruptions, early-exiting at the first correct exit reduces the inference cost and provides a significant boost in accuracy (~10%) over exiting at the last layer. However, with realistic early-exit strategies, which do not assume knowledge about the correct exits, MEMs still reduce inference cost but provide only a marginal improvement in accuracy (~1%) compared to exiting at the last layer. Moreover, the presence of distribution shift widens the gap between an MEM's maximum classification accuracy and realistic early-exit strategies by 5% on average compared with the gap on in-distribution data. Our empirical analysis shows that the lack of calibration due to a distribution shift increases the susceptibility of such early-exit strategies to exit early and increases misclassification rates. Furthermore, the lack of calibration increases the inconsistency in the predictions of the model across exits, leading to both inefficient inference and more misclassifications compared with evaluation on in-distribution data. Finally, we propose two metrics, underthinking and overthinking, that quantify the different behavior of practical early-exit strategies under distribution shifts, and provide insights into improving the practical utility of MEMs.
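The contrast between exiting at the first correct exit and a realistic early-exit strategy can be illustrated with a minimal sketch. It is not the authors' implementation. The sketch assumes that each exit produces a vector of class probabilities, and that the realistic strategy is a plain confidence threshold on the maximum probability (one common choice; the abstract does not say which strategies were evaluated). The function names, the threshold value, and the example probabilities are all hypothetical.

```python
from typing import List, Sequence, Tuple


def argmax(probs: Sequence[float]) -> int:
    """Index of the largest probability (first one on ties)."""
    return max(range(len(probs)), key=lambda i: probs[i])


def confidence_exit(exit_probs: List[Sequence[float]],
                    threshold: float = 0.9) -> Tuple[int, int]:
    """Assumed realistic strategy: leave at the first exit whose top
    probability reaches `threshold`; otherwise use the last exit.
    Returns (exit_index, predicted_class)."""
    for k, probs in enumerate(exit_probs):
        if max(probs) >= threshold:
            return k, argmax(probs)
    last = len(exit_probs) - 1
    return last, argmax(exit_probs[last])


def oracle_exit(exit_probs: List[Sequence[float]],
                label: int) -> Tuple[int, int]:
    """Idealized strategy: leave at the first exit that predicts the true
    label (needs the label, so it is not usable at inference time).
    Falls back to the last exit if no exit is correct."""
    for k, probs in enumerate(exit_probs):
        if argmax(probs) == label:
            return k, label
    last = len(exit_probs) - 1
    return last, argmax(exit_probs[last])


if __name__ == "__main__":
    # Hypothetical sample with three exits and three classes; true label is 1.
    # The first exit is confidently wrong, mimicking poor calibration.
    sample = [
        [0.95, 0.03, 0.02],
        [0.30, 0.60, 0.10],
        [0.10, 0.85, 0.05],
    ]
    print(confidence_exit(sample, threshold=0.9))  # exits at 0, predicts class 0
    print(oracle_exit(sample, label=1))            # exits at 1, predicts class 1
```

In this made-up sample, the threshold rule leaves at the first exit with a wrong prediction, while the oracle waits one more exit and is correct. This is the kind of premature, miscalibrated exit the abstract attributes to distribution shift.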